

 mention-pair model


Machine Learning for Entity Coreference Resolution: A Retrospective Look at Two Decades of Research

AAAI Conferences

... which entity mentions in a text or dialogue refer to the same real-world entity. Despite being actively investigated for 50 years in the natural language processing (NLP) community, it is still far from being solved. To better understand the difficulty of the task, consider the following sentence: The Queen Mother asked Queen Elizabeth II to transform her sister, Princess Margaret, into a viable ...

In general, however, the difficulty of coreference resolution stems from its reliance on sophisticated knowledge sources and inference mechanisms (Mitkov et al. 2001). Despite its difficulty, coreference resolution is a core task in information extraction: it is the fundamental technology for consolidating the textual information about an entity, which is crucial for essentially all high-level NLP applications, such as question answering, text summarization, and machine translation.


Narrowing the Modeling Gap: A Cluster-Ranking Approach to Coreference Resolution

Journal of Artificial Intelligence Research

Traditional learning-based coreference resolvers operate by training the mention-pair model for determining whether two mentions are coreferent or not. Though conceptually simple and easy to understand, the mention-pair model is linguistically rather unappealing and lags far behind the heuristic-based coreference models proposed in the pre-statistical NLP era in terms of sophistication. Two independent lines of recent research have attempted to improve the mention-pair model, one by acquiring the mention-ranking model to rank preceding mentions for a given anaphor, and the other by training the entity-mention model to determine whether a preceding cluster is coreferent with a given mention. We propose a cluster-ranking approach to coreference resolution, which combines the strengths of the mention-ranking model and the entity-mention model, and is therefore theoretically more appealing than both of these models. In addition, we seek to improve cluster rankers via two extensions: (1) lexicalization and (2) incorporating knowledge of anaphoricity by jointly modeling anaphoricity determination and coreference resolution. Experimental results on the ACE data sets demonstrate the superior performance of cluster rankers to competing approaches as well as the effectiveness of our two extensions.
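The cluster-ranking idea in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `toy_score` and `cluster_rank_resolve` are hypothetical, the token-overlap scorer is a toy stand-in for a learned ranker, and the fixed score for the "start a new entity" option is an assumption used here to mimic joint handling of anaphoricity.

```python
from typing import Callable, List, Optional, Sequence

Mention = str
Cluster = List[Mention]


def toy_score(mention: Mention, cluster: Optional[Cluster]) -> float:
    """Hypothetical scorer standing in for a learned ranking model.

    `cluster is None` represents the option "this mention is not anaphoric,
    so start a new entity". Its fixed score of 0.5 is an assumption.
    """
    if cluster is None:
        return 0.5
    tokens = set(mention.lower().split())
    cluster_tokens = {t for m in cluster for t in m.lower().split()}
    # Toy cluster-level feature: number of tokens shared with the whole cluster.
    return float(len(tokens & cluster_tokens))


def cluster_rank_resolve(
    mentions: Sequence[Mention],
    score: Callable[[Mention, Optional[Cluster]], float] = toy_score,
) -> List[Cluster]:
    """Process mentions left to right, ranking all preceding clusters."""
    clusters: List[Cluster] = []
    for mention in mentions:
        # Candidates: the "new entity" option plus every preceding cluster.
        candidates: List[Optional[Cluster]] = [None] + clusters
        best = max(candidates, key=lambda c: score(mention, c))
        if best is None:
            clusters.append([mention])  # judged non-anaphoric: new entity
        else:
            best.append(mention)  # attach to the top-ranked cluster
    return clusters


if __name__ == "__main__":
    doc = ["Queen Elizabeth II", "Princess Margaret", "the Queen", "Margaret"]
    print(cluster_rank_resolve(doc))
```

In contrast to a mention-pair classifier, which judges each pair of mentions independently, this loop compares all candidate clusters for a mention at once and scores each using information from the whole cluster, which is the combination of ranking and cluster-level modeling that the abstract describes.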